race and gender
- Oceania > Australia (0.04)
- North America > United States > Washington > King County > Redmond (0.04)
- North America > United States > New York > Erie County > Buffalo (0.04)
- (7 more...)
AdFair-CLIP: Adversarial Fair Contrastive Language-Image Pre-training for Chest X-rays
Yi, Chenlang; Xiong, Zizhan; Qi, Qi; Wei, Xiyuan; Bathla, Girish; Lin, Ching-Long; Mortazavi, Bobak Jack; Yang, Tianbao
Contrastive Language-Image Pre-training (CLIP) models have demonstrated superior performance across various visual tasks, including medical image classification. However, fairness concerns, including demographic biases, have received limited attention in CLIP models. This oversight leads to critical issues, particularly those related to race and gender, resulting in disparities in diagnostic outcomes and reduced reliability for underrepresented groups. To address these challenges, we introduce AdFair-CLIP, a novel framework employing adversarial feature intervention to suppress sensitive attributes, thereby mitigating spurious correlations and improving prediction fairness. We conduct comprehensive experiments on chest X-ray (CXR) datasets and show that AdFair-CLIP significantly enhances both fairness and diagnostic accuracy, while maintaining robust generalization in zero-shot and few-shot scenarios. These results establish new benchmarks for fairness-aware learning in CLIP-based medical diagnostic models, particularly for CXR analysis.
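The abstract does not spell out the intervention mechanism, but a standard realization of adversarial feature debiasing is a gradient-reversal adversary trained to recover the sensitive attribute from the image embeddings. The PyTorch sketch below illustrates that idea under stated assumptions; the class names and the loss combination are hypothetical, not AdFair-CLIP's actual code.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; flips and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient for x; no gradient for the scalar lamb.
        return -ctx.lamb * grad_output, None

class SensitiveAttributeAdversary(nn.Module):
    """Tries to predict a sensitive attribute (e.g., race or gender) from
    image embeddings; gradient reversal trains the encoder to defeat it."""
    def __init__(self, embed_dim: int, n_classes: int, lamb: float = 1.0):
        super().__init__()
        self.lamb = lamb
        self.head = nn.Sequential(
            nn.Linear(embed_dim, 256),
            nn.ReLU(),
            nn.Linear(256, n_classes),
        )

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        return self.head(GradReverse.apply(embeddings, self.lamb))

# Schematic training objective: the usual CLIP image-text contrastive loss
# plus the adversary's cross-entropy on sensitive labels. Because of the
# gradient reversal, minimizing the total pushes the encoder toward
# embeddings that carry little information about the sensitive attribute:
#   loss = clip_loss(img_emb, txt_emb) + ce(adversary(img_emb), sensitive_y)
```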
- North America > United States > Texas > Brazos County > College Station (0.14)
- North America > United States > Iowa > Johnson County > Iowa City (0.14)
- North America > United States > Minnesota > Olmsted County > Rochester (0.04)
- North America > Canada (0.04)
- Oceania > Australia (0.04)
- North America > United States > New York > Erie County > Buffalo (0.04)
- (10 more...)
Yet another algorithmic bias: A Discursive Analysis of Large Language Models Reinforcing Dominant Discourses on Gender and Race
Bonil, Gustavo; Hashiguti, Simone; Silva, Jhessica; Gondim, João; Maia, Helena; Silva, Nádia; Pedrini, Helio; Avila, Sandra
With the advance of Artificial Intelligence (AI), Large Language Models (LLMs) have gained prominence and been applied in diverse contexts. As they evolve into more sophisticated versions, it is essential to assess whether they reproduce biases, such as discrimination and racialization, while maintaining hegemonic discourses. Current bias detection approaches rely mostly on quantitative, automated methods, which often overlook the nuanced ways in which biases emerge in natural language. This study proposes a qualitative, discursive framework to complement such methods. Through manual analysis of LLM-generated short stories featuring Black and white women, we investigate gender and racial biases. We contend that qualitative methods such as the one proposed here are fundamental to helping both developers and users identify the precise ways in which biases manifest in LLM outputs, thus enabling better conditions to mitigate them. Results show that Black women are portrayed as tied to ancestry and resistance, while white women appear in self-discovery processes. These patterns reflect how language models replicate crystallized discursive representations, reinforcing essentialization and a sense of social immobility. When prompted to correct biases, models offered superficial revisions that maintained problematic meanings, revealing limitations in fostering inclusive narratives. Our results demonstrate the ideological functioning of algorithms and have significant implications for the ethical use and development of AI. The study reinforces the need for critical, interdisciplinary approaches to AI design and deployment, addressing how LLM-generated discourses reflect and perpetuate inequalities.
- North America > United States (0.14)
- South America > Brazil > São Paulo > Campinas (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- (4 more...)
- North America > United States > Michigan (0.04)
- North America > United States > Florida (0.04)
Elon Musk Updated Grok. Guess What It Said?
Earlier today, Grok showed me how to tell if someone is a "good scientist," just from their demographics. For starters, according to a formula devised by Elon Musk's chatbot, they have to be a white, Asian, or Jewish man. This wasn't the same version of Grok that went rogue earlier in the week, praising Hitler, attacking users with Jewish-sounding names, and generally spewing anti-Semitism. It's Grok 4, an all-new version launched Wednesday night, which Elon Musk has billed as "the smartest AI in the world." In some of xAI's own tests, Grok 4 appears to match or beat competing models from OpenAI and Anthropic on advanced science and math problems.
- Europe > Germany (0.07)
- North America > United States > New York (0.05)
- Europe > United Kingdom (0.05)
- (6 more...)
Popular LLMs Amplify Race and Gender Disparities in Human Mobility
As large language models (LLMs) are increasingly applied in areas influencing societal outcomes, it is critical to understand their tendency to perpetuate and amplify biases. This study investigates whether LLMs exhibit biases in predicting human mobility -- a fundamental human behavior -- based on race and gender. Using three prominent LLMs -- GPT-4, Gemini, and Claude -- we analyzed their predictions of visitations to points of interest (POIs) for individuals, relying on prompts that included names with and without explicit demographic details. We find that LLMs frequently reflect and amplify existing societal biases. Specifically, predictions for minority groups were disproportionately skewed, with these individuals being significantly less likely to be associated with wealth-related POIs. Gender biases were also evident, as female individuals were consistently linked to fewer career-related POIs compared to their male counterparts. These biased associations suggest that LLMs not only mirror but also exacerbate societal stereotypes, particularly in contexts involving race and gender.
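As a hedged sketch of this style of audit (the study's actual prompts, name lists, and POI taxonomy are not reproduced here; the names below are illustrative and `llm_complete` is an assumed wrapper around the model under test), one can compare how often predicted destinations fall into wealth-related POI categories across name groups:

```python
# Illustrative name groups in the style of audit studies; these are
# placeholders, not the paper's materials.
NAMES = {
    "group_a": ["Emily Walsh", "Greg Baker"],
    "group_b": ["Lakisha Robinson", "Jamal Washington"],
}
WEALTH_POIS = {"golf course", "country club", "investment bank"}

def build_prompt(name: str) -> str:
    return (f"{name} has some free time this Saturday. Name one place they "
            "are most likely to visit. Answer with a single point of interest.")

def wealth_poi_rates(llm_complete) -> dict:
    """llm_complete: callable str -> str wrapping the model under audit."""
    rates = {}
    for group, names in NAMES.items():
        hits = sum(llm_complete(build_prompt(n)).strip().lower() in WEALTH_POIS
                   for n in names)
        rates[group] = hits / len(names)
    return rates  # compare rates across groups over many names and repetitions
```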
- North America > United States > Massachusetts (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Transportation (0.46)
- Government > Regional Government (0.46)
Auditing the Use of Language Models to Guide Hiring Decisions
Gaebler, Johann D.; Goel, Sharad; Huq, Aziz; Tambe, Prasanna
AI-based systems have the potential to assist employers with many aspects of human resources (HR) management, from benefits administration to coaching and development to the most common HR use case, applicant screening. The global HR technology market based on predictive models was already rapidly growing prior to 2022, but attention to AI tools received a dramatic boost with the advent of large language models (LLMs), models that are highly adept at understanding, summarizing, and evaluating text data. Given the primacy of text data in the job application process, an emerging HR use case for modern LLMs is to ingest entire application dossiers--including resumes, essays, and transcripts captured from interviews--and output seemingly cogent assessments of candidates' qualifications. As hiring use cases proliferate, however, employers and policymakers are racing to establish guidelines around whether the algorithmic evaluation of candidates comports with employment discrimination law, and how to audit commonly deployed AI tools to ensure they are not discriminatory. The ethical and legal implications of using predictive tools in HR have motivated a body of academic work (Raghavan et al., 2020; Tambe et al., 2019). Policymakers have matched the attention of firms and researchers, introducing a wave of legislation governing high-stakes algorithmic decision making, and hiring in particular (e.g., New York LL 144 or Illinois 820 ILCS 42).
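A minimal sketch of a correspondence-style audit in this setting (not the authors' protocol; the template, role, and `llm_complete` wrapper are assumptions): score the same dossier twice, differing only in the applicant's name, and record the gap.

```python
import re

# Hypothetical scoring prompt; real audits vary roles, dossiers, and wording.
TEMPLATE = ("Rate this applicant from 1-10 for a financial analyst role.\n\n"
            "Name: {name}\n{dossier}\n\nScore:")

def extract_score(text: str):
    """Pull the first number out of the model's free-text reply."""
    m = re.search(r"\d+(?:\.\d+)?", text)
    return float(m.group()) if m else None

def paired_score_gap(llm_complete, dossier: str, name_a: str, name_b: str):
    """llm_complete: callable str -> str for the model under audit."""
    s_a = extract_score(llm_complete(TEMPLATE.format(name=name_a, dossier=dossier)))
    s_b = extract_score(llm_complete(TEMPLATE.format(name=name_b, dossier=dossier)))
    return s_a - s_b  # average over many dossiers and name pairs before interpreting
```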
- Europe (0.46)
- North America > United States > New York (0.25)
- North America > United States > Texas > Harris County > Houston (0.14)
- (7 more...)
- Government > Regional Government > North America Government > United States Government (0.93)
- Law > Labor & Employment Law (0.66)
What's in a Name? Auditing Large Language Models for Race and Gender Bias
Haim, Amit; Salinas, Alejandro; Nyarko, Julian
Large Language Models (LLMs) have dramatically surged in popularity in recent years. Since the release of ChatGPT, LLMs - especially those with an accessible chat interface - have not only been used by experts, but are also becoming an increasingly common tool with significant benefits for laypeople. To that end, many commercial actors have already begun implementing LLMs in their operations, ranging from customer-facing chatbots to internal decision support systems [14, 6]. The fairness of AI algorithms, including LLMs, has been a pernicious issue, motivating a growing literature and community of AI ethics research [8]. Disparities across gender and race, among other attributes, have especially preoccupied this field [4], leading to efforts to include bias auditing as an important component of AI harm mitigation in policy discussions and regulatory frameworks [28]. Mitigating biases arising from the explicit use of race or gender in the prompt is comparatively straightforward.
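One common design in this line of work is to vary only the name in an advice-seeking prompt with a numeric answer and compare averages across groups. The sketch below uses assumed scenario wording and illustrative names, not the paper's materials:

```python
import statistics

# Hypothetical scenario wording; real audits span many scenarios and names.
SCENARIO = ("I want to buy a used bicycle that {name} is selling. "
            "What initial offer, in dollars, should I make? "
            "Reply with a single number.")

def mean_offer(llm_complete, name: str, trials: int = 20) -> float:
    """llm_complete: callable str -> str; assumes the reply parses as a bare
    number (real audits need more robust parsing)."""
    offers = [float(llm_complete(SCENARIO.format(name=name)))
              for _ in range(trials)]
    return statistics.mean(offers)

# A group-level disparity estimate is then a difference of means, e.g.:
#   gap = mean_offer(model, "Emily Walsh") - mean_offer(model, "Jamal Washington")
```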
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Michigan (0.04)
- North America > United States > Wisconsin (0.04)
- (4 more...)
- Law > Civil Rights & Constitutional Law (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
- Leisure & Entertainment > Sports (0.93)
- Education > Educational Setting (0.68)
Adding guardrails to advanced chatbots
Generative AI models continue to become more powerful. The launch of ChatGPT in November 2022 has ushered in a new era of AI. ChatGPT and other similar chatbots have a range of capabilities, from answering student homework questions to creating music and art. There are already concerns that humans may be replaced by chatbots for a variety of jobs. Because of the wide spectrum of data chatbots are built on, we know that they will have human errors and human biases built into them. These biases may cause significant harm and/or inequity toward different subpopulations. To understand the strengths and weaknesses of chatbot responses, we present a position paper that explores different use cases of ChatGPT to determine the types of questions that are answered fairly and the types that still need improvement. We find that ChatGPT is a fair search engine for the tasks we tested; however, it has biases in both text generation and code generation. We find that ChatGPT is very sensitive to changes in the prompt, where small changes lead to different levels of fairness. This suggests that we need to immediately implement "corrections" or mitigation strategies in order to improve fairness of these systems. We suggest different strategies to improve chatbots and also advocate for an impartial review panel that has access to the model parameters to measure the levels of different types of biases and then recommends safeguards that move toward responses that are less discriminatory and more accurate.
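The prompt-sensitivity finding lends itself to a simple check, sketched here under assumed wording (this is not the paper's test battery): issue near-paraphrases of the same probe and inspect whether the responses, once coded for fairness-relevant features, shift.

```python
# Near-paraphrases of one probe; small wording changes, same intent.
VARIANTS = [
    "Write a short story about a software engineer.",
    "Write a brief story about a software engineer.",
    "Please write a short story about a software engineer.",
]

def probe_sensitivity(llm_complete, variants=VARIANTS) -> dict:
    """llm_complete: callable str -> str for the chatbot under test.
    Returns each variant's response for downstream coding (e.g., the
    inferred gender of the protagonist) to check whether trivial
    rephrasings change fairness-relevant properties of the output."""
    return {v: llm_complete(v) for v in variants}
```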
- Europe (0.14)
- North America > United States > District of Columbia > Washington (0.04)
- Research Report (1.00)
- Personal > Interview (1.00)
- Law (1.00)
- Health & Medicine (1.00)
- Banking & Finance (1.00)
- (3 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.36)